Recommendations

Major Recommendations 

Definitions for the strength of the recommendations (Strong Recommendation or Qualified 
Recommendation) are provided at the end of the "Major Recommendations" field. 

American Cancer Society (ACS) Guideline for Breast Cancer Screening, 2015

These recommendations represent guidance from the ACS for women at average risk of breast cancer: 
women without a personal history of breast cancer, a suspected or confirmed genetic mutation known to 
increase risk of breast cancer (e.g., BRCA), or a history of previous radiotherapy to the chest at a young 
age. 

The ACS recommends that all women should become familiar with the potential benefits, limitations, and 
harms associated with breast cancer screening. 






Recommendations 


Women with an average risk of breast cancer should undergo regular screening mammography
starting at age 45 years. (Strong Recommendation)

Women aged 45 to 54 years should be screened annually. (Qualified Recommendation)

Women 55 years and older should transition to biennial screening or have the opportunity to
continue screening annually. (Qualified Recommendation)

Women should have the opportunity to begin annual screening between the ages of 40 and 44
years. (Qualified Recommendation)

Women should continue screening mammography as long as their overall health is good and they
have a life expectancy of 10 years or longer. (Qualified Recommendation)

The ACS does not recommend clinical breast examination for breast cancer screening among
average-risk women at any age. (Qualified Recommendation)
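
The age-based schedule above can be read as a simple lookup. The Python sketch below is illustrative only: the function name, parameter names, and return strings are assumptions introduced here, not part of the guideline. It applies only to women at average risk and reduces the health and life-expectancy condition to a single yes/no input.

```python
def screening_option(age: int, healthy_with_10yr_life_expectancy: bool = True) -> str:
    """Return the ACS 2015 mammography guidance for an average-risk woman.

    Hypothetical helper for illustration only; not a clinical decision tool.
    """
    if age < 40:
        # Ages below 40 are not covered by the recommendations listed above.
        return "not addressed"
    if not healthy_with_10yr_life_expectancy:
        # Continued screening is tied to good overall health and a life
        # expectancy of 10 years or longer.
        return "continuation criterion not met"
    if age <= 44:
        return "opportunity to begin annual screening"
    if age <= 54:
        return "annual screening"
    return "biennial screening, or continue annually"


if __name__ == "__main__":
    for example_age in (38, 42, 50, 60):
        print(example_age, "->", screening_option(example_age))
```

The sketch does not encode the strength of each recommendation; readers should refer to the labels in the list above for that.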

Definitions 

Interpretation of Strong and Qualified Recommendations by Users of the Guideline 



For patients

Strong Recommendations: Most individuals in this situation would want the recommended course of
action, and only a small proportion would not.

Qualified Recommendations: The majority of individuals in this situation would want the
suggested course of action, but many would not. Patient preferences and informed decision making
are desirable for making decisions.

For clinicians

Strong Recommendations: Most individuals should receive the recommended course of action.
Adherence to this recommendation according to the guideline could be used as a quality criterion
or performance indicator.

Qualified Recommendations: Clinicians should acknowledge that different choices will be
appropriate for different patients and that clinicians must help each patient arrive at a
management decision consistent with her or his values and preferences. Decision aids may be useful
to help individuals in making decisions consistent with their values and preferences. Clinicians
should expect to spend more time with patients when working toward a decision.


Clinical Algorithm(s) 

None provided 


Scope 

Disease/Condition(s) 

Breast cancer 


Guideline Category 

Screening 


Clinical Specialty 

Family Practice
Internal Medicine
Obstetrics and Gynecology
Oncology
Preventive Medicine


Intended Users 

Advanced Practice Nurses
Health Care Providers
Health Plans
Managed Care Organizations
Nurses
Patients
Physician Assistants
Physicians
Public Health Departments


Guideline Objective(s) 

To update the American Cancer Society (ACS) 2003 breast cancer screening guideline for women at 
average risk for breast cancer 


Target Population 

Women at average risk of breast cancer 


Interventions and Practices Considered 

1. Screening mammography at annual or biennial intervals 

2. Clinical breast examination for screening (not recommended) 

Major Outcomes Considered 

Critical Outcomes 

Breast cancer mortality (breast cancer deaths prevented by screening) 

Quality of life (quality-adjusted life-years gained by screening) 

Life expectancy (life-years gained by screening) 

False-positive findings (recall for additional testing [imaging and/or biopsy] after abnormal clinical 
breast examination or mammography, in which further evaluation determines that the initial 
abnormal finding was not cancer) 

Overdiagnosis (screen-detected cancers that would not have led to symptomatic breast cancer if 
undetected by screening) 

Overtreatment (cancer therapies [surgery, radiation, chemotherapy] performed for screen-detected 
cancers that would not have led to symptomatic breast cancer if undetected by screening) 


Important But Not Critical Outcomes 


Breast cancer stage (tumor characteristics at diagnosis, including stage, tumor size, and nodal
status) 

Short- and long-term emotional effects (anxiety, depression, quality of life associated with positive 
results) 


Methodology 

Methods Used to Collect/Select the Evidence 

Hand-searches of Published Literature (Secondary Sources) 
Searches of Electronic Databases 


Description of Methods Used to Collect/Select the Evidence 

The American Cancer Society (ACS) Guideline Development Group (GDG) selected the Duke University 
Evidence Synthesis Group to conduct an independent systematic evidence review (see the "Availability of 
Companion Documents" field) of the breast cancer screening literature, after a response to a request for 
proposals. In addition, the ACS commissioned the Breast Cancer Surveillance Consortium (BCSC) to 
update previously published analyses related to the screening interval and outcomes. The ACS 
Surveillance and Health Services Research Program provided supplementary data on disease burden using 
data from the Surveillance, Epidemiology, and End Results (SEER) Program. 

Parameters of Review 

With input from the ACS and GDG, the reviewers developed specific key questions (KQs) relevant to 
breast cancer screening and specified patients, interventions, comparators, outcomes, timing, and settings
(PICOTS) for each question (see Box 1 in the original guideline document). In their review, they focused
on the results for women at "average" risk for breast cancer, defined as the absence of a known
susceptibility gene mutation (e.g., BRCA1/BRCA2); history of previous breast cancer or ductal carcinoma
in situ (DCIS); family history of breast cancer; or history of lobular neoplasia, proliferative lesions on prior 
biopsy, or chest irradiation. Abstracts and full-text articles were screened for descriptions of the subject 
population and excluded if women at higher risk were either exclusively included or explicitly included but 
results not reported separately. Population-based studies that did not report results separately were not 
excluded. The review reports on results for KQ 1 (mammography screening vs no screening), KQ 2 
(mammography screening at different intervals), and KQ 3 (clinical breast examination [CBE] with or 
without mammography). The review identified almost no evidence relevant to KQ 4 (screening vs no 
screening in women at high risk of breast cancer) or KQ 5 (screening at different intervals for high-risk 
women). 

Search Strategy 

The reviewers searched PubMed (to March 6, 2014), CINAHL (to September 10, 2013), and PsycINFO (to 
September 10, 2013). No earlier date limit was used for randomized clinical trials (RCTs); for 
observational studies, the group searched for all citations published after January 1, 2000. "Gray 
literature" was not searched nor were attempts made to identify unpublished studies. An experienced 
search librarian advised on all searches. Exact search strings are included in eAppendix 1 in the
systematic review supplement (eTables 1, 2, and 3). Hand searches of 4 systematic reviews of RCTs and 
3 systematic reviews of observational studies were performed to ensure that all these studies were 
included. All citations were imported into an electronic database (EndNote X4; Thomson Reuters) that 
was also used for recording screening decisions and data extraction. 


Study Selection 





The reviewers developed specific inclusion/exclusion criteria (see Box 2 in the systematic review) that 
were used by 2 investigators to independently review titles and abstracts for potential relevance to the 
KQs. Articles included by either reviewer underwent full-text screening. At the full-text review stage, 
paired researchers independently reviewed the articles and indicated a decision to "include" or "exclude" 
the article for data abstraction. When the 2 reviewers arrived at different decisions about whether to 
include or exclude an article, they reconciled the difference through review and discussion, or through a 
third-party arbitrator if needed. Full-text articles meeting the eligibility criteria were included for data 
abstraction. 
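
The two-stage, dual-reviewer screening described above can be summarized as a small decision rule. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, and the outcome of reconciliation (by discussion or third-party arbitration) is passed in as an input rather than modeled.

```python
from typing import Optional


def advances_to_full_text(reviewer_a_include: bool, reviewer_b_include: bool) -> bool:
    """Title/abstract stage: an article included by either reviewer moves on."""
    return reviewer_a_include or reviewer_b_include


def full_text_decision(reviewer_a_include: bool,
                       reviewer_b_include: bool,
                       reconciled_include: Optional[bool] = None) -> bool:
    """Full-text stage: agreement stands; disagreement needs a reconciled decision.

    `reconciled_include` stands in for the result of discussion or of a
    third-party arbitrator (an assumption made for this sketch).
    """
    if reviewer_a_include == reviewer_b_include:
        return reviewer_a_include
    if reconciled_include is None:
        raise ValueError("reviewers disagree; a reconciled decision is required")
    return reconciled_include


if __name__ == "__main__":
    print(advances_to_full_text(True, False))
    print(full_text_decision(True, False, reconciled_include=False))
```

Note the asymmetry: the first stage is deliberately inclusive (either reviewer suffices), while the second requires agreement or explicit reconciliation.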

Refer to the "Search Results" section of the systematic review for a discussion on the results of the 
literature search. 


Number of Source Documents 

After applying inclusion/exclusion criteria, 160 articles representing 93 studies passed full-text screening 
and were included for abstraction. See Figure 2 in the Duke evidence synthesis report for a literature flow
diagram (see the "Availability of Companion Documents" field). 


Methods Used to Assess the Quality and Strength of the Evidence 

Weighting According to a Rating Scheme (Scheme Given) 


Rating Scheme for the Strength of the Evidence 

Overall Quality of the Body of Evidence Using Grading of Recommendations Assessment, Development
and Evaluation (GRADE)

High: The reviewers are very confident that the true effect lies close to that of the estimate of the
effect. (Alternative: Further research is very unlikely to change confidence on the estimate of effect.)

Moderate: The reviewers are moderately confident in the effect estimate; the true effect is likely to be
close to the estimate of effect, but there is a possibility that it is substantially different. (Alternative:
Further research is likely to have an important impact on confidence in the estimate of effect and may
change the estimate.)

Low: Confidence in the effect estimate is limited; the true effect may be substantially different from
the estimate of the effect. (Alternative: Further research is very likely to have an important impact on
confidence in the estimate of effect and is likely to change the estimate.)

Very low: The reviewers have very little confidence in the effect estimate; the true effect is likely to be
substantially different from the estimate of effect. (Alternative: Evidence on an outcome is absent or too
weak, sparse, or inconsistent to estimate an effect.)


Methods Used to Analyze the Evidence 

Review of Published Meta-Analyses 
Systematic Review with Evidence Tables 


Description of the Methods Used to Analyze the Evidence 

The American Cancer Society (ACS) Guideline Development Group (GDG) selected the Duke University 
Evidence Synthesis Group to conduct an independent systematic evidence review (see the "Availability of
Companion Documents" field) of the breast cancer screening literature, after a response to a request for
proposals. In addition, the ACS commissioned the Breast Cancer Surveillance Consortium (BCSC) to 
update previously published analyses related to the screening interval and outcomes. The ACS 
Surveillance and Health Services Research Program provided supplementary data on disease burden using 
data from the Surveillance, Epidemiology, and End Results (SEER) Program. 

Prior to the review, the GDG had agreed to use the Grading of Recommendations Assessment, 
Development and Evaluation (GRADE) framework to assess quality of evidence and resulting certainty or 
uncertainty about benefit and harm to formulate specific recommendations. Decisions about which 
outcomes were "critical" in the context of GRADE (e.g., all false-positive results vs false-positive biopsy 
results) were made by the GDG. 

Data Abstraction 

Based on clinical and methodological expertise, a pair of investigators were assigned to abstract data 
from each eligible article. One investigator abstracted the data, and the second reviewed the completed 
abstraction form alongside the original article to check for accuracy and completeness. Disagreements 
were resolved by consensus or by obtaining a third reviewer's opinion if consensus could not be reached. 
In addition to specific study characteristics, individual study limitations (risk of bias) were rated using a 
4-point scale from very low to high quality using the GRADE methodology. 

Qualitative Evidence Synthesis 

The reviewers summarized results and methodological limitations of included studies, noted qualitative 
patterns or inconsistencies, and identified common themes and potential explanations for observed 
patterns or inconsistencies. 

Quantitative Evidence Synthesis 

Four high-quality systematic reviews or meta-analyses published within the past 4 years have 
synthesized the available data, particularly for breast cancer mortality, and have reported roughly similar 
results. Prior to beginning their review, the Evidence Synthesis Group planned to conduct a new
meta-analysis only if any additional literature was substantially different in results from previous studies or
would substantively improve the ability to grade the quality of evidence for a particular outcome (because 
it would substantially improve the precision of the estimate of association with harm or benefit). As of 
March 6, 2014, no updated evidence from included studies, new evidence from other studies, or new 
evidence for outcomes not amenable to quantitative synthesis in previous reviews (such as 
overdiagnosis) were identified, and it was judged that additional meta-analyses would not substantially 
help the GDG resolve uncertainties about the evidence. 

Absolute associations of screening (particularly mortality reduction and cumulative false-positive rates) 
were estimated using both the results of simulation-based modeling reported in the literature and results 
of simpler models (see eAppendix 2 in the systematic review supplement). Published population-based 
data on incidence, mortality, and survival, screening prevalence, and estimates of false-positive outcome 
were used. To estimate the absolute reduction in mortality associated with screening over 15 years, the 
reviewers used an approach similar to that of a previous model (see eAppendix 2 in the systematic review 
supplement), based on estimates of prevalence of screening, the mortality reduction associated with 
screening (across a wide range of estimates), and observed mortality. However, in contrast to the 
previous approach, the reviewers used age-specific incidence-based mortality rather than age-specific 
mortality. Because breast cancer deaths at a given age reflect cases diagnosed both at that age and 
younger ages, estimating the effects of screening using this approach will not capture the potential effect 
of screening prior to the beginning of a given age interval (for example, applying a given risk reduction 
associated with mammography to 50- to 59-year-old women will include deaths attributable to cancer 
diagnosed prior to age 50 years). Using incidence-based mortality results in somewhat lower estimates of 
number needed to screen (NNS), compared with age-specific mortality at a given estimate of screening 
effectiveness. 
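
The arithmetic behind an absolute mortality reduction and the number needed to screen (NNS) can be sketched as follows. This is a simplified illustration, not the reviewers' actual model (which is described in eAppendix 2 of the systematic review supplement). It assumes that observed mortality is a mixture of screened and unscreened women, so that observed = unscreened x (1 - prevalence x relative reduction). The function name and all input values are hypothetical placeholders.

```python
def number_needed_to_screen(observed_mortality: float,
                            screening_prevalence: float,
                            relative_reduction: float) -> float:
    """Estimate NNS over a fixed horizon from population-level inputs.

    observed_mortality   -- cumulative breast cancer mortality in the population
                            over the horizon (e.g., 15 years), as a proportion
    screening_prevalence -- proportion of the population that is screened
    relative_reduction   -- relative mortality reduction attributed to screening

    Simplifying assumption (not from the guideline): observed mortality is a
    weighted mix of screened and unscreened women.
    """
    if observed_mortality <= 0:
        raise ValueError("observed_mortality must be positive")
    if not 0 <= screening_prevalence <= 1:
        raise ValueError("screening_prevalence must be between 0 and 1")
    if not 0 < relative_reduction < 1:
        raise ValueError("relative_reduction must be between 0 and 1 (exclusive)")

    # Back out the mortality that would be seen with no screening at all.
    unscreened_mortality = observed_mortality / (
        1 - screening_prevalence * relative_reduction
    )
    # Absolute reduction for a screened woman relative to an unscreened one.
    absolute_reduction = unscreened_mortality * relative_reduction
    return 1 / absolute_reduction


if __name__ == "__main__":
    # Placeholder inputs chosen only to show the calculation.
    print(round(number_needed_to_screen(0.005, 0.6, 0.2)))
```

With these placeholder inputs the estimate is about 880 women screened per death prevented. Because NNS is the reciprocal of the absolute reduction, a higher baseline mortality input yields a lower NNS at the same relative reduction, which is the direction of the effect described above for incidence-based mortality.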


Overall Quality Rating



The reviewers graded the overall quality of the body of evidence for each outcome per Key Question (KQ) 
based on the specific criteria outlined by GRADE (see eAppendix 1 in the systematic review supplement 
[eTables 5 and 6]). There is no explicit "formula" for grading strength of evidence when data are available 
from both randomized controlled trials (RCTs) and observational studies, particularly when, as is the case 
with breast cancer screening, differences exist in the magnitude of association across different study 
designs and factors other than study internal validity or risk of bias, such as secular trends in incidence, 
screening technology, and treatment effectiveness, may influence the applicability of the evidence to the 
population of interest. For each outcome per KQ, the Evidence Synthesis Group provided an assessment 
of the overall strength of evidence across all included study designs by assessing 4 domains: (1) risk of 
bias (graded primarily by study design, with RCTs having the lowest risk of bias, and, within study 
designs, by factors such as method of randomization, adequacy of adjustment for potential confounding, 
and plausible direction of unmeasured confounding); (2) consistency (graded primarily on consistency in 
the direction of association—e.g., did studies consistently show a reduction in breast cancer mortality 
across a range of study designs and settings); (3) directness (graded based on whether the study directly 
measured the outcome of interest, rather than a direct measure of a surrogate or an estimate of the 
outcome based on the modeling of surrogates, and by applicability to screening as currently practiced in 
the United States); and (4) precision, which primarily affected the estimate of the magnitude (strength) 
of association—for example, it would be possible for evidence for a particular outcome to be considered 
high quality in terms of consistency if all studies showed the same direction of association (e.g., 
decreased breast cancer mortality) but low quality for the magnitude of association if results varied 
substantially within or across study designs or settings. If available, results from meta-analyses were 
used when evaluating consistency (forest plots, tests for heterogeneity), precision (confidence intervals), 
and strength of association (weighted mean difference). These domains were considered qualitatively,
and a summary rating of high, moderate, low, or very low strength of evidence was assigned after
discussion by 2 investigators. Any disagreement was resolved
through consensus. 

Much of the available evidence on the outcomes of specific screening strategies, particularly for the 
United States, is derived from published studies of simulation modeling. GRADE does not provide explicit 
guidance on how to weight modeling studies. Even the most sophisticated modeling study will be limited 
by the strength of the evidence available for the most important parameters. In general,
modeling is often most useful for addressing questions for which direct evidence is difficult to obtain
(comparing a large number of different screening intervals and starting and stopping ages), and
virtually all models require assumptions or imputed values (such as the progression rate of undetected
cancer) to produce usable results (such as estimates of cancer deaths prevented). Therefore, the
Evidence Synthesis Group assumed that modeling studies could be no higher than moderate quality. As
part of the total body of evidence, modeling studies raised quality if they contributed to improved 
consistency of results (e.g., if model-based estimates of mortality reduction were consistent with 
observational studies that were not used to provide inputs into the model). 


Methods Used to Formulate the Recommendations 

Expert Consensus 


Description of Methods Used to Formulate the Recommendations 

The Process 

In accordance with the new guideline development process, the American Cancer Society (ACS) organized 
an interdisciplinary Guideline Development Group (GDG) consisting of clinicians (n = 4), biostatisticians 
(n = 2), epidemiologists (n = 2), an economist (n = 1), and patient representatives (n = 2). After evaluating
available methods to grade the evidence and the strength of recommendations, the GDG selected the 
Grading of Recommendations Assessment, Development and Evaluation (GRADE) system. GRADE is an 
accepted approach with a defined analytic framework, an explicit consideration of values and preferences
addressing patient-centered outcomes, the capacity for flexibility in evaluating results from observational
studies, and separation between quality of evidence and strength of recommendation. 

The ACS GDG selected the Duke University Evidence Synthesis Group to conduct an independent 
systematic evidence review of the breast cancer screening literature, after a response to a request for 
proposals. This effort is referred to as the evidence review. In addition, the ACS commissioned the Breast 
Cancer Surveillance Consortium (BCSC) to update previously published analyses related to the screening 
interval and outcomes. The ACS Surveillance and Health Services Research Program provided 
supplementary data on disease burden using data from the Surveillance, Epidemiology, and End Results 
(SEER) Program. 

The GDG deliberations on the evidence and framing of the recommendations were guided by the GRADE 
domains: the balance between desirable and undesirable outcomes, the diversity in women's values and 
preferences, and confidence in the magnitude of the effects on outcomes. The GDG chose to assess 
recommendations as "strong" or "qualified," in accordance with GRADE guidance. A strong 
recommendation conveys the consensus that the benefits of adherence to the intervention outweigh the 
undesirable effects. Qualified recommendations indicate there is clear evidence of benefit but less 
certainty about either the balance of benefits and harms, or about patients' values and preferences, 
which could lead to different decisions (see the "Rating Scheme for the Strength of the 
Recommendations" field). 

The GDG members voted on agreement or disagreement with each recommendation and on the strength 
of recommendation. A record of the vote with respect to each question was made without attribution. The 
panel attempted to achieve 100% agreement whenever possible, but a three-quarters majority was 
considered acceptable. 
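
The voting rule can be made concrete with a short calculation. The sketch below is an illustration under assumptions: it takes the panel size as the 11 members listed above (4 + 2 + 2 + 1 + 2), assumes all members vote, and reads "three-quarters majority" as "at least three-quarters". The function names are hypothetical.

```python
import math
from fractions import Fraction

# Panel composition as listed in the text: clinicians, biostatisticians,
# epidemiologists, an economist, and patient representatives.
PANEL_SIZE = 4 + 2 + 2 + 1 + 2


def votes_for_acceptance(panel_size: int,
                         threshold: Fraction = Fraction(3, 4)) -> int:
    """Smallest number of agreeing votes that reaches the threshold."""
    return math.ceil(panel_size * threshold)


def vote_outcome(agree: int, panel_size: int) -> str:
    """Classify a vote as unanimous, acceptable, or not acceptable."""
    if not 0 <= agree <= panel_size:
        raise ValueError("agree must be between 0 and panel_size")
    if agree == panel_size:
        return "unanimous"
    if agree >= votes_for_acceptance(panel_size):
        return "acceptable"
    return "not acceptable"


if __name__ == "__main__":
    print(PANEL_SIZE, votes_for_acceptance(PANEL_SIZE))
    print(vote_outcome(9, PANEL_SIZE))
```

Under these assumptions, three-quarters of 11 is 8.25, so at least 9 agreeing votes would be needed for a recommendation to be considered acceptable.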


Rating Scheme for the Strength of the Recommendations

Interpretation of Strong and Qualified Recommendations by Users of the Guideline

For patients

Strong Recommendations: Most individuals in this situation would want the recommended course of
action, and only a small proportion would not.

Qualified Recommendations: The majority of individuals in this situation would want the
suggested course of action, but many would not. Patient preferences and informed decision making
are desirable for making decisions.

For clinicians

Strong Recommendations: Most individuals should receive the recommended course of action.
Adherence to this recommendation according to the guideline could be used as a quality criterion
or performance indicator.

Qualified Recommendations: Clinicians should acknowledge that different choices will be
appropriate for different patients and that clinicians must help each patient arrive at a
management decision consistent with her or his values and preferences. Decision aids may be useful
to help individuals in making decisions consistent with their values and preferences. Clinicians
should expect to spend more time with patients when working toward a decision.


Cost Analysis 

The evidence review team excluded studies reporting economic outcomes only. The guideline developers 
considered the Grading of Recommendations Assessment, Development and Evaluation (GRADE) domains 
of the balance between desirable and undesirable patient-important outcomes, the diversity in women's
values and preferences, and confidence in the magnitude of the effects on outcomes. Resource use and 
cost were not factors in decisions about recommendations. 


Method of Guideline Validation

External Peer Review
Internal Peer Review


Description of Method of Guideline Validation 

Prior to submitting the final guideline for publication, 26 relevant outside organizations and 22 expert 
advisors were invited to participate in an external review of the guideline. Responses were documented 
and reviewed by the Guideline Development Group (GDG) to determine if modifications in the narrative or 
recommendations were warranted. 

The American Cancer Society (ACS) Mission Outcomes Committee and Board of Directors reviewed and 
approved the guideline. Final decisions were the responsibility of the GDG. 


Evidence Supporting the Recommendations 

Type of Evidence Supporting the Recommendations 

Studies were included in the evidence synthesis if they met the following inclusion criteria: 

Controlled studies, including randomized controlled trials (RCTs), pooled patient-level
meta-analyses, systematic reviews, and study-level meta-analyses.

Observational studies (prospective and retrospective cohort studies, incidence-based mortality 
studies, case-control studies, or cross-sectional studies) published since 2000 that included 1000 or 
more average-risk women. 

Modeling/simulation studies, because these studies may be the only way to generate estimates of 
long-term outcomes associated with screening that are not adequately addressed by the RCTs or 
using modern technology and protocols. 
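
The design-based inclusion criteria above can be expressed as a simple filter. This is a minimal sketch, assuming a hypothetical `Study` record and hypothetical design labels; it covers only the criteria listed here, not the full inclusion/exclusion rules in Box 2 of the systematic review.

```python
from dataclasses import dataclass

# Design labels are illustrative names chosen for this sketch.
CONTROLLED = {"rct", "pooled_meta_analysis", "systematic_review",
              "study_level_meta_analysis"}
OBSERVATIONAL = {"prospective_cohort", "retrospective_cohort",
                 "incidence_based_mortality", "case_control", "cross_sectional"}
MODELING = {"modeling"}


@dataclass(frozen=True)
class Study:
    design: str
    year: int                 # publication year
    average_risk_women: int   # number of average-risk women included


def meets_design_criteria(study: Study) -> bool:
    """Apply the study-type criteria listed in the text."""
    if study.design in CONTROLLED or study.design in MODELING:
        return True
    if study.design in OBSERVATIONAL:
        # Observational studies: published since 2000, with 1000 or more
        # average-risk women.
        return study.year >= 2000 and study.average_risk_women >= 1000
    return False


if __name__ == "__main__":
    print(meets_design_criteria(Study("case_control", 2005, 2500)))
    print(meets_design_criteria(Study("case_control", 1998, 2500)))
```

The date and sample-size limits apply only to observational designs, which mirrors the search strategy's statement that no earlier date limit was used for RCTs.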


Benefits/Harms of Implementing the Guideline 
Recommendations 

Potential Benefits 

Mammography screening has been shown to be associated with a reduction in breast cancer mortality 
across a range of study designs, including randomized controlled trials (RCTs) and observational studies 
(trend analyses, cohort studies, and case-control studies), with most studies demonstrating a significant 
benefit (see Table 3 in the original guideline document for a table of estimated relative reduction in 
breast cancer mortality associated with mammography screening, by study design among pooled studies). 


Potential Harms 

• False-positive findings are common in breast cancer screening. The most common outcome of a 
false-positive finding is being recalled for additional imaging. 

• While the Guideline Development Group (GDG) recognizes that overdiagnosis represents the greatest 
possible harm associated with screening because it would result in overtreatment, uncertainty about 
the magnitude of the risk of overdiagnosis poses a challenge to providing complete and accurate 
information to women about what to expect from breast cancer screening. 

• When making decisions on screening intervals, it is important to consider the harm-benefit trade-off. 
While annual screening yielded a larger reduction in breast cancer mortality than biennial screening,
a more frequent screening schedule also resulted in a higher rate of false-positive findings.

• Women in poor health or with severe comorbid conditions and limited life expectancy may be more 
vulnerable to harms of screening, including anxiety and discomfort associated with additional testing 
and risk of overdiagnosis (due to increased risk of dying from non-breast cancer-related causes) as 
well as to harms from breast cancer treatment. Thus, health and life expectancy, not simply age, 
must be considered in screening decisions. 


Qualifying Statements 

Qualifying Statements 

The American Cancer Society (ACS) had an advisory role in the design and conduct of the study; 
collection, management, analysis, and interpretation of the data; preparation, review, and approval of the 
manuscript; and decision to submit the manuscript for publication. 

Limitations 

There are invariably gaps between the available evidence and the evidence needed for the development 
of guidelines that precisely quantify and weigh the benefits vs the harms associated with breast cancer 
screening. The Guideline Development Group (GDG) synthesized evidence from a variety of sources, 
including the randomized controlled trials (RCTs), observational studies of modern service screening, and 
modeling studies. Still, even after broadening the evidence base, gaps remain. Empirical comparisons of 
screening programs that differ in terms of their ages to start and stop screening, and in their intervals 
between screening examinations, generally were lacking. Further, most breast screening studies did not 
provide estimates of benefits and harms over a lifetime horizon, which is important when considering 
policies that will span several decades or more of an individual's lifetime. The value and applicability of 
meta-analysis of mammography screening RCTs to guide current health policy also should be kept in 
perspective. While the RCT evidence demonstrated the efficacy of mammography screening, these studies 
were conducted from the 1960s through the 1990s with varying protocols, most using older screen-film 
systems and often using single-view mammography. The RCTs demonstrated a range of outcomes in 
terms of mortality reductions and, importantly, in terms of the degree to which an invitation to screening 
was associated with a reduced risk of being diagnosed with an advanced breast cancer, which is strongly 
associated with reduced breast cancer mortality. Overall and age-specific mortality reduction estimates 
derived from meta-analysis of intention-to-treat results do not reveal these differences in the 
performance of the trials. In addition, RCT estimates based on intention-to-treat analyses are influenced 
by nonadherence to the protocol by both the invited and control group. In these respects, meta-analysis 
results are a sound basis for judging the efficacy of mammography screening, but a poor basis for
estimating the effectiveness of modern high-quality screening, especially when calculating absolute 
benefits and harms. 

Refer to the "Limitations" section of the original guideline document for additional discussion. 


Implementation of the Guideline 

Description of Implementation Strategy 

An implementation strategy was not provided. 


Implementation Tools 


Patient Resources 



Staff Training/Competency Material


For information about availability, see the Availability of Companion Documents and Patient Resources 
fields below. 


Institute of Medicine (IOM) National Healthcare Quality
Report Categories

IOM Care Need

Staying Healthy


IOM Domain

Effectiveness
Patient-centeredness